1) 18S amplicons

1.1) Data overview

Let’s read the dataset and remove the samples containing less than 8522 reads:

## [1] 47144   130
##   OTUId st001 st002 st003 st004
## 1 OTU_1   469   187   240   135
## 2 OTU_2   445   389   760   819
## 3 OTU_3  1030  2196  1611   453
## 4 OTU_4   135    66    36   365
## 5 OTU_5     0   178   285   175
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = "NA"): invalid factor
## level, NA generated
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = "NA"): invalid factor
## level, NA generated
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = "NA"): invalid factor
## level, NA generated
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = "NA"): invalid factor
## level, NA generated
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = "NA"): invalid factor
## level, NA generated
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = "NA"): invalid factor
## level, NA generated
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = "NA"): invalid factor
## level, NA generated
## [1] 47144   129
##       st001 st002 st003 st004 st005
## OTU_1   469   187   240   135   184
## OTU_2   445   389   760   819  1003
## OTU_3  1030  2196  1611   453   512
## OTU_4   135    66    36   365   146
## OTU_5     0   178   285   175    49
## [1] 47144     7
##                                                                                                                          SILVA_taxonomic_classification
## OTU_1         KC488417.1.1688_Eukaryota;SAR;Alveolata;Dinoflagellata;Dinophyceae;Gymnodiniphycidae;Gymnodinium_clade;FV18-2D9;uncultured_dinoflagellate
## OTU_2                              FJ832119.1.1585_Eukaryota;SAR;Alveolata;Protalveolata;Syndiniales;Syndiniales_Group_I;uncultured_marine_picoplankton
## OTU_3                                       EF172960.1.1674_Eukaryota;SAR;Alveolata;Protalveolata;Syndiniales;Syndiniales_Group_II;uncultured_eukaryote
## OTU_4 KC488405.1.1558_Eukaryota;SAR;Alveolata;Dinoflagellata;Dinophyceae;Gymnodiniphycidae;Gymnodinium_clade;Erythropsidinium;uncultured_dinoflagellate
## OTU_5                                      EU793901.1.1663_Eukaryota;SAR;Alveolata;Protalveolata;Syndiniales;Syndiniales_Group_I;uncultured_syndiniales
##         MAS_taxonomic_classification   BM_taxonomic_classification
## OTU_1 EU793855_Dinophyceae_Alveolata 9492_Dinophyceae_Alveolata_15
## OTU_2      EU793773_MALV-I_Alveolata     3261_MALV-I_Alveolata_385
## OTU_3     EU793316_MALV-II_Alveolata                          <NA>
## OTU_4 EU780636_Dinophyceae_Alveolata 21474_Dinophyceae_Alveolata_6
## OTU_5      EU818388_MALV-I_Alveolata      398_MALV-I_Alveolata_395
##       SILVA_class   MAS_class    BM_class MAS_plus_BM_plus_SILVA_class
## OTU_1 Dinophyceae Dinophyceae Dinophyceae                  Dinophyceae
## OTU_2        <NA>        <NA>        <NA>                         <NA>
## OTU_3        <NA>        <NA>        <NA>                         <NA>
## OTU_4 Dinophyceae Dinophyceae Dinophyceae                  Dinophyceae
## OTU_5        <NA>        <NA>        <NA>                         <NA>
## [1] 47144   122
##       st001 st002 st003 st004 st005
## OTU_1   469   187   240   135   184
## OTU_2   445   389   760   819  1003
## OTU_3  1030  2196  1611   453   512
## OTU_4   135    66    36   365   146
## OTU_5     0   178   285   175    49
## st054 
##  4050
## [1] 47144   121
## [1] 47144   117
## [1] 47144     0

Table dimensions and content outline:

## [1] 47144   117
##       st001 st002 st003 st004 st005
## OTU_1   469   187   240   135   184
## OTU_2   445   389   760   819  1003
## OTU_3  1030  2196  1611   453   512
## OTU_4   135    66    36   365   146
## OTU_5     0   178   285   175    49

Minimum number of reads per station:

min(colSums(tb18_tax_occur_min8522)) 
## [1] 8522

Maximum number of reads per station:

max(colSums(tb18_tax_occur_min8522)) 
## [1] 936570

Identification of station with higher number of reads:

amplicons_per_sample<-colSums(tb18_tax_occur_min8522)
amplicons_per_sample[which(colSums(tb18_tax_occur_min8522)>900000)]
##  st131 
## 936570

Overall reads per sample:

1.2) Normalization

Let’s normalize the original dataset by randomly subsampling 8522 reads in each station:

library(vegan)
## Loading required package: permute
## Loading required package: lattice
## This is vegan 2.3-1
tb18_tax_occur_min8522_t<-t(tb18_tax_occur_min8522)
tb18_tax_occur_ss8522<-rrarefy(tb18_tax_occur_min8522_t, 8522)

The normalized table shows the following dimensions and format:

## [1]   117 47144
##       OTU_1 OTU_2 OTU_3 OTU_4 OTU_5
## st001    26    25    49     8     0
## st002    23    44   241     3    24
## st003    34   131   237     2    48
## st004    37   189    94   102    31
## st005    24   141    76    32     8

Its content fits with the expected normalization values (8522 reads per station):

rowSums(tb18_tax_occur_ss8522)
## st001 st002 st003 st004 st005 st007 st009 st010 st012 st013 st014 st015 
##  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522 
## st017 st018 st019 st022 st023 st024 st025 st026 st027 st028 st029 st030 
##  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522 
## st032 st033 st034 st035 st037 st038 st039 st040 st041 st043 st044 st045 
##  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522 
## st046 st047 st049 st050 st051 st052 st053 st055 st056 st057 st058 st059 
##  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522 
## st060 st062 st063 st064 st065 st066 st067 st068 st069 st070 st071 st072 
##  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522 
## st073 st074 st075 st076 st077 st078 st079 st081 st082 st083 st085 st086 
##  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522 
## st087 st088 st089 st092 st093 st094 st095 st096 st097 st098 st101 st102 
##  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522 
## st103 st104 st106 st107 st108 st109 st110 st112 st114 st115 st116 st118 
##  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522 
## st119 st120 st121 st125 st126 st127 st128 st129 st130 st131 st132 st133 
##  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522  8522 
## st134 st135 st138 st139 st140 st141 st143 st145 st146 
##  8522  8522  8522  8522  8522  8522  8522  8522  8522

Let’s check out how many OTUs don’t appear in the new table:

length(which(colSums(tb18_tax_occur_ss8522)==0)) 
## [1] 20431

There are 20487 OTUs that don’t show any occurrence in the normalized data. Let’s remove them from the table and take a look at its final dimensions:

tb18_tax_occur_ss8522_no_cero<-tb18_tax_occur_ss8522[,-(which(colSums(tb18_tax_occur_ss8522)==0))]
dim(tb18_tax_occur_ss8522_no_cero)
## [1]   117 26713

Datasets summary: dim(tb18_tax) –> 47144 129 dim(tb18_tax_occur) –> 47144 122 dim(tb18_tax_occur_ss8522_no_cero) –> 117 26657

1.3) General community analysis

1.3.1) Richness and evenness (Shannon index)

Most of the samples take Shannon Index values between 5.5 and 6:

1.3.2) Richness: OTU number

Lowest number of OTUs per sample:

## [1] 168

Maximum number of OTUs per sample:

## [1] 1986

In most of the samples, we can identify between 1000 and 1300 OTUs:

plot(sort(OTUs_per_sample_18S_tax_occur_ss8522), pch=19)

boxplot(OTUs_per_sample_18S_tax_occur_ss8522, pch=19)

1.3.3) Index of evenness

1.3.3.1) Pielou’s index

pielou_evenness_18S_tax_occur_ss8522 <- tb18_tax_occur_ss8522_div/log(OTUs_per_sample_18S_tax_occur_ss8522)

The Pielou index (constrained between 0 and 1) takes values closer to 1 as the variation of species proportion in a sample decreases. Most of the samples get values between 0.8 and 0.9, meaning that the numerical composition of different OTUs in a sample is highly similar:

plot(sort(pielou_evenness_18S_tax_occur_ss8522), pch=19)

boxplot(pielou_evenness_18S_tax_occur_ss8522, pch=19)

The OTU_2, with 27292 reads, is the most abundant in the overall dataset:

head(sort(colSums(tb18_tax_occur_ss8522_no_cero), decreasing=T), n=10L)
##  OTU_2  OTU_6 OTU_23  OTU_8 OTU_25 OTU_27  OTU_1 OTU_12  OTU_9  OTU_7 
##  48439  35256  12264  11760  10044   9850   8568   8022   7937   7600

Most of the OTUs show very few occurrences; the plot suggests that we will probably be able to identify a significant ammount of rare otus:

plot(log(sort(colSums(tb18_tax_occur_ss8522_no_cero), decreasing=T)), pch=19)

1.3.4) Abundance Models

1.3.4.1) Rank-Abundance or Dominance/Diversity Model (“radfit”)

The OTUs abundance distribution fits relativelly close to log-normal model.

1.3.4.2) Preston’s Lognormal Model

According to Preston’s lognormal model fit into species frequencies groups, we’re missing ~3226 species:

tb18_tax_occur_ss8522_prestonfit<-prestonfit(colSums(tb18_tax_occur_min8522_t))
plot(tb18_tax_occur_ss8522_prestonfit, main="Pooled species")

veiledspec(tb18_tax_occur_ss8522_prestonfit)
## Extrapolated     Observed       Veiled 
##    49081.293    46593.000     2488.293

When computing Prestons’ lognormal model fit without pooling data into groups, we seem to miss ~3055 species:

## Extrapolated     Observed       Veiled 
##    48957.844    46593.000     2364.844

1.3.6) Beta diversity

1.3.6.1) Dissimilarity matrix using Bray-Curtis index:

The Bray-Curtis dissimilarity, constrained between 0 (minimum distance) and 1 (highest dissimilarity) allows us to quantify the differences between samples according to the composition and relative abundance of their OTUs. In our dataset, most of the samples pairs take dissimilarity values between between 0.7 and 0.8, meaning that their composition is substantially different.

1.3.6.2) Hierarchical clustering

The only relatively evident cluster we can distinguish in the dendogram stands out in the very left side of the plot.

(To be done: assign Longhurst provinces information to each station and check if any of the central clusters is meaningful regarding to the samples’ geographical ubication)

1.3.6.3) Non-metric multidimensional scaling

We can identify a prominent group in the central part of the NMDS plot and a few outliers in the middle-left edge of the plot. The stress parameter takes a value below 0.3, suggesting that the plot is acceptable.

## 
## Call:
## monoMDS(dist = tb18_tax_occur_ss8522_no_cero.bray) 
## 
## Non-metric Multidimensional Scaling
## 
## 117 points, dissimilarity 'bray', call 'vegdist(x = tb18_tax_occur_ss8522_no_cero, method = "bray")'
## 
## Dimensions: 2 
## Stress:     0.2007517 
## Stress type 1, weak ties
## Scores scaled to unit root mean square, rotated to principal components
## Stopped after 91 iterations: Stress nearly unchanged (ratio > sratmax)

When implementing a most robut function for computing NMDS plots, the result is quiet the same:

## Run 0 stress 0.2006672 
## Run 1 stress 0.2006664 
## ... New best solution
## ... procrustes: rmse 0.001004021  max resid 0.007828648 
## *** Solution reached
## Warning in ordiplot(x, choices = choices, type = type, display = display, :
## Species scores not available

1.4) Geographical analysis

Working datasets:

  1. Community matrix: tb18_tax_occur_ss8522_no_cero
dim(tb18_tax_occur_ss8522_no_cero)
## [1]   117 26713
tb18_tax_occur_ss8522_no_cero[1:5, 1:5]
##       OTU_1 OTU_2 OTU_3 OTU_4 OTU_5
## st001    26    25    49     8     0
## st002    23    44   241     3    24
## st003    34   131   237     2    48
## st004    37   189    94   102    31
## st005    24   141    76    32     8
  1. Community Bray-Curtis: tb18_tax_occur_ss8522_no_cero.bray
## NULL
tb18_tax_occur_ss8522_no_cero.bray<-as.matrix(tb18_tax_occur_ss8522_no_cero.bray)
  1. Stations distances in km: geo_distances_MP_18S
dim(geo_distances_MP_18S)
## [1] 117 117

Communities quickly change their composition across geographical distances:

plot(geo_distances_MP_18S, tb18_tax_occur_ss8522_no_cero.bray, pch=19, cex=0.4, xlab="Geopgraphical distances", ylab="Bray-Curtis dissimilarities")

1.4.1) Mantel correlograms

Mantel statistic is -significantlly- so low, meaning that the correlation between samples dissimilarity and geographical distances is weak.

mantel(geo_distances_MP_18S, tb18_tax_occur_ss8522_no_cero.bray)
## 
## Mantel statistic based on Pearson's product-moment correlation 
## 
## Call:
## mantel(xdis = geo_distances_MP_18S, ydis = tb18_tax_occur_ss8522_no_cero.bray) 
## 
## Mantel statistic r: 0.1172 
##       Significance: 0.001 
## 
## Upper quantiles of permutations (null model):
##    90%    95%  97.5%    99% 
## 0.0232 0.0296 0.0377 0.0447 
## Permutation: free
## Number of permutations: 999

Maximum distance between samples:

## [1] 19500.19

Minimum distance between samples:

## [1] 0

Correlograms:

MP_18s_ss8522_mantel_correl_by_1000km<-mantel.correlog(tb18_tax_occur_ss8522_no_cero.bray, D.geo=geo_distances_MP_18S, break.pts=seq(0,20000, by=1000))
plot(MP_18s_ss8522_mantel_correl_by_1000km)

MP_18s_ss8522_mantel_correl_by_100km<-mantel.correlog(tb18_tax_occur_ss8522_no_cero.bray, D.geo=geo_distances_MP_18S, break.pts=seq(0,20000, by=100))
plot(MP_18s_ss8522_mantel_correl_by_100km)

1.5) Abundance vs. occurence

In the following plot, we can appreciate the OTUs distribution according to their percentage of occurence and relative abundance. The red line keeps up OTUs that occur in more than 80% of the samples, the green line limits regionally rare OTUs (< 0.001%), and the blue one restricts regionally abundant OTUs (> 0.1%).

Regionally abundant OTUs (relative abundance over 0.1%):

##           mean_rabund perc_occur
## OTU_2     0.048581148  100.00000
## OTU_6     0.035359462   98.29060
## OTU_23    0.012299990   89.74359
## OTU_8     0.011794511   93.16239
## OTU_25    0.010073475   97.43590
## OTU_27    0.009878906   99.14530
## OTU_1     0.008593144  100.00000
## OTU_12    0.008045541   99.14530
## OTU_9     0.007960292   64.10256
## OTU_7     0.007622303   97.43590
## OTU_14696 0.007580180   83.76068
## OTU_13    0.007133874   95.72650
## OTU_4     0.006976413   98.29060
## OTU_20    0.006860073   49.57265
## OTU_11    0.006806917   97.43590
## OTU_24    0.006582260   95.72650
## OTU_18    0.006405743   99.14530
## OTU_3     0.005555255   80.34188
## OTU_34    0.005196204   52.13675
## OTU_5     0.004919394   77.77778
## OTU_119   0.004804057   19.65812
## OTU_21    0.004780989   86.32479
## OTU_22    0.004643587   79.48718
## OTU_26    0.004545300   88.88889
## OTU_45    0.004444003   98.29060
## OTU_10    0.004252443   82.05128
## OTU_64    0.004210319  100.00000
## OTU_19    0.004174214   95.72650
## OTU_17    0.004143123   88.88889
## OTU_40    0.004006724   94.87179
## OTU_58    0.004004718   61.53846
## OTU_70    0.003977639   67.52137
## OTU_36    0.003915457   85.47009
## OTU_106   0.003816166   77.77778
## OTU_50    0.003762008   88.03419
## OTU_28    0.003607556   87.17949
## OTU_44    0.003544371   40.17094
## OTU_14    0.003438060   86.32479
## OTU_5618  0.003410981   88.03419
## OTU_55    0.003296646   97.43590
## OTU_31    0.003216411   93.16239
## OTU_42    0.003177297   92.30769
## OTU_15    0.003155232   39.31624
## OTU_81    0.003112106   90.59829
## OTU_38    0.003099068   85.47009
## OTU_32    0.003073995   64.95726
## OTU_39    0.003058950   99.14530
## OTU_51    0.003056945   84.61538
## OTU_30    0.003013818   56.41026
## OTU_47    0.002972698   62.39316
## OTU_80    0.002937595   58.11966
## OTU_33    0.002870399   94.01709
## OTU_767   0.002864381   24.78632
## OTU_35799 0.002829278   95.72650
## OTU_48    0.002829278   82.90598
## OTU_41    0.002816240   87.17949
## OTU_142   0.002807214   69.23077
## OTU_72    0.002806211   94.87179
## OTU_65    0.002804205   93.16239
## OTU_37    0.002761079   90.59829
## OTU_46    0.002755061   25.64103
## OTU_84    0.002724973   58.97436
## OTU_29    0.002699900   95.72650
## OTU_52    0.002675829   76.92308
## OTU_75    0.002624680   93.16239
## OTU_35    0.002588574   79.48718
## OTU_49    0.002573530   95.72650
## OTU_56    0.002547454   52.99145
## OTU_71    0.002504328   95.72650
## OTU_74    0.002477248   87.17949
## OTU_43    0.002406040   60.68376
## OTU_62    0.002397014   74.35897
## OTU_85    0.002348873   82.05128
## OTU_122   0.002346867   55.55556
## OTU_66    0.002274656   92.30769
## OTU_101   0.002262620   74.35897
## OTU_137   0.002255600   55.55556
## OTU_68    0.002226515   90.59829
## OTU_16    0.002212474   23.07692
## OTU_76    0.002211471   81.19658
## OTU_69    0.002208462   50.42735
## OTU_54    0.002168345   84.61538
## OTU_7889  0.002162327   66.66667
## OTU_102   0.002134245   68.37607
## OTU_11454 0.002108169   97.43590
## OTU_78    0.002087107   93.16239
## OTU_6315  0.002086104   82.90598
## OTU_73    0.002000855   64.95726
## OTU_92    0.001981799   66.66667
## OTU_82    0.001969764   76.92308
## OTU_89    0.001891535   57.26496
## OTU_87    0.001822332   82.90598
## OTU_96    0.001811300   58.97436
## OTU_57    0.001778203   86.32479
## OTU_126   0.001746109   94.01709
## OTU_124   0.001727053   73.50427
## OTU_53    0.001714015   24.78632
## OTU_109   0.001666877   54.70085
## OTU_59    0.001639798   70.08547
## OTU_3988  0.001636789   69.23077
## OTU_90    0.001628766   91.45299
## OTU_115   0.001620742   38.46154
## OTU_116   0.001612719   68.37607
## OTU_61    0.001593663   75.21368
## OTU_67    0.001590654   91.45299
## OTU_121   0.001562572   93.16239
## OTU_110   0.001542513   76.92308
## OTU_203   0.001542513   17.94872
## OTU_98    0.001535493   74.35897
## OTU_79    0.001509417   56.41026
## OTU_83    0.001469299   62.39316
## OTU_77    0.001456261   55.55556
## OTU_88    0.001451246   86.32479
## OTU_104   0.001437205   68.37607
## OTU_95    0.001428179   72.64957
## OTU_1882  0.001421158   59.82906
## OTU_112   0.001404108   84.61538
## OTU_103   0.001398091   78.63248
## OTU_99    0.001390067   59.82906
## OTU_63    0.001349950   41.02564
## OTU_108   0.001343932   60.68376
## OTU_243   0.001324877   94.01709
## OTU_1136  0.001319862   91.45299
## OTU_135   0.001294789   79.48718
## OTU_94    0.001286765   89.74359
## OTU_162   0.001285762   62.39316
## OTU_177   0.001268712   85.47009
## OTU_35494 0.001260689   58.97436
## OTU_111   0.001259686   88.03419
## OTU_128   0.001254671   77.77778
## OTU_91    0.001250659   63.24786
## OTU_174   0.001243639   54.70085
## OTU_188   0.001239627   91.45299
## OTU_146   0.001212548   61.53846
## OTU_192   0.001206530   82.90598
## OTU_120   0.001196501   60.68376
## OTU_125   0.001194495   86.32479
## OTU_113   0.001186472   63.24786
## OTU_132   0.001175439   45.29915
## OTU_105   0.001168419   39.31624
## OTU_131   0.001158389   89.74359
## OTU_145   0.001152372   74.35897
## OTU_93    0.001141340   73.50427
## OTU_165   0.001141340   68.37607
## OTU_152   0.001128301   67.52137
## OTU_157   0.001120278   29.05983
## OTU_166   0.001107240   57.26496
## OTU_141   0.001099216   55.55556
## OTU_11833 0.001077152   76.06838
## OTU_338   0.001069128   82.90598
## OTU_1819  0.001066119   48.71795
## OTU_180   0.001064114   57.26496
## OTU_133   0.001058096   50.42735
## OTU_163   0.001056090   61.53846
## OTU_140   0.001042049   71.79487
## OTU_151   0.001040043   57.26496
## OTU_7573  0.001031017   82.90598
## OTU_235   0.001030014   71.79487
## OTU_136   0.001023996   44.44444
## OTU_130   0.001015973   88.03419
## OTU_129   0.001015973   54.70085
## OTU_123   0.001009955   61.53846
dim(tb18_ss8522_abundant_sorted)
## [1] 162   2

Proportion of regionally abundant OTUs (%):

## [1] 0.6430868

Cosmopolitan OTUs (relative abundance over 0.1% and occurence in more than 80% of samples):

##           mean_rabund perc_occur
## OTU_2     0.048581148  100.00000
## OTU_1     0.008593144  100.00000
## OTU_64    0.004210319  100.00000
## OTU_27    0.009878906   99.14530
## OTU_12    0.008045541   99.14530
## OTU_18    0.006405743   99.14530
## OTU_39    0.003058950   99.14530
## OTU_6     0.035359462   98.29060
## OTU_4     0.006976413   98.29060
## OTU_45    0.004444003   98.29060
## OTU_25    0.010073475   97.43590
## OTU_7     0.007622303   97.43590
## OTU_11    0.006806917   97.43590
## OTU_55    0.003296646   97.43590
## OTU_11454 0.002108169   97.43590
## OTU_13    0.007133874   95.72650
## OTU_24    0.006582260   95.72650
## OTU_19    0.004174214   95.72650
## OTU_35799 0.002829278   95.72650
## OTU_29    0.002699900   95.72650
## OTU_49    0.002573530   95.72650
## OTU_71    0.002504328   95.72650
## OTU_40    0.004006724   94.87179
## OTU_72    0.002806211   94.87179
## OTU_33    0.002870399   94.01709
## OTU_126   0.001746109   94.01709
## OTU_243   0.001324877   94.01709
## OTU_8     0.011794511   93.16239
## OTU_31    0.003216411   93.16239
## OTU_65    0.002804205   93.16239
## OTU_75    0.002624680   93.16239
## OTU_78    0.002087107   93.16239
## OTU_121   0.001562572   93.16239
## OTU_42    0.003177297   92.30769
## OTU_66    0.002274656   92.30769
## OTU_90    0.001628766   91.45299
## OTU_67    0.001590654   91.45299
## OTU_1136  0.001319862   91.45299
## OTU_188   0.001239627   91.45299
## OTU_81    0.003112106   90.59829
## OTU_37    0.002761079   90.59829
## OTU_68    0.002226515   90.59829
## OTU_23    0.012299990   89.74359
## OTU_94    0.001286765   89.74359
## OTU_131   0.001158389   89.74359
## OTU_26    0.004545300   88.88889
## OTU_17    0.004143123   88.88889
## OTU_50    0.003762008   88.03419
## OTU_5618  0.003410981   88.03419
## OTU_111   0.001259686   88.03419
## OTU_130   0.001015973   88.03419
## OTU_28    0.003607556   87.17949
## OTU_41    0.002816240   87.17949
## OTU_74    0.002477248   87.17949
## OTU_21    0.004780989   86.32479
## OTU_14    0.003438060   86.32479
## OTU_57    0.001778203   86.32479
## OTU_88    0.001451246   86.32479
## OTU_125   0.001194495   86.32479
## OTU_36    0.003915457   85.47009
## OTU_38    0.003099068   85.47009
## OTU_177   0.001268712   85.47009
## OTU_51    0.003056945   84.61538
## OTU_54    0.002168345   84.61538
## OTU_112   0.001404108   84.61538
## OTU_14696 0.007580180   83.76068
## OTU_48    0.002829278   82.90598
## OTU_6315  0.002086104   82.90598
## OTU_87    0.001822332   82.90598
## OTU_192   0.001206530   82.90598
## OTU_338   0.001069128   82.90598
## OTU_7573  0.001031017   82.90598
## OTU_10    0.004252443   82.05128
## OTU_85    0.002348873   82.05128
## OTU_76    0.002211471   81.19658
## OTU_3     0.005555255   80.34188
dim(otu_tb18_ss8522_cosmop_sorted)
## [1] 76  2

Proportion of cosmopolitan OTUs (%):

## [1] 0.254059

Number and proportion (%) of rare OTUs:

dim(otu_tb18_ss8522_rabund_percoccur[otu_tb18_ss8522_rabund_percoccur$mean_rabund < 0.00001 & otu_tb18_ss8522_rabund_percoccur$mean_rabund >0,])
## [1] 20730     2
## [1] 73.64138

1.6) Taxonomic composition analysis

1.6.1) Normalized data

1.6.1.1) Absolute values

Let’s add the taxonomic classification to the left OTUs by merging “tb18_tax_occur_ss8522_no_cero” with “tb18_tax”:

## [1] 26713   125
##   Row.names st001 st002 st003 st004
## 1     OTU_1    26    23    34    37
## 2    OTU_10     2     9    38    40
## 3   OTU_100    13    30     4     6
## 4  OTU_1000     0     0     1     1
## 5 OTU_10000     0     0     0     1
## [1] 26713   124
##           st001 st002 st003 st004 st005
## OTU_1        26    23    34    37    24
## OTU_10        2     9    38    40    11
## OTU_100      13    30     4     6     8
## OTU_1000      0     0     1     1     0
## OTU_10000     0     0     0     1     0
## [1] 26713   125
##       st001 st002 st003 st004 st005
## OTU_1    26    23    34    37    24
## OTU_2    25    44   131   189   141
## OTU_3    49   241   237    94    76
## OTU_4     8     3     2   102    32
## OTU_5     0    24    48    31     8
## [1] 7727  125
##        st001 st002 st003 st004 st005
## OTU_1     26    23    34    37    24
## OTU_4      8     3     2   102    32
## OTU_18    53    49    57    78    67
## OTU_25   113   106   119   147   205
## OTU_33    31    23    28    11    32
##        st001 st002 st003 st004 st005
## OTU_1     26    23    34    37    24
## OTU_4      8     3     2   102    32
## OTU_18    53    49    57    78    67
## OTU_25   113   106   119   147   205
## OTU_33    31    23    28    11    32
## [1] 6451  117
## [1] 117
## [1] 117
## [1] 117
## [1] 111
## [1] 116
## [1] 95
## [1] 90
## [1] 97
## [1] 95
## [1] 89
## [1] 47
## [1] 32
## [1] 11
## [1] 25
## [1] 19
## [1] 27
## [1] 6
##               Group.1      x
## 8         Dinophyceae 137924
## 13     Prasinophyceae  20175
## 5       Chrysophyceae  16440
## 11      Pelagophyceae   9270
## 7    Dictyochophyceae   8084
## 6     Cryptomonadales   2110
## 1     Bacillariophyta    765
## 3  Chlorarachniophyta    728
## 2       Bolidophyceae    573
## 12  Pinguiochysidales    301
## 14   Prymnesiophyceae    100
## 10    Mamiellophyceae     77
## 15     Raphydophyceae     41
## 4       Chlorophyceae     40
## 9       Eustigmatales     40
## 17        Ulvophyceae     30
## 16   Trebouxiophyceae     25
##               Group.1    x
## 8         Dinophyceae 6451
## 13     Prasinophyceae  300
## 5       Chrysophyceae  257
## 11      Pelagophyceae  225
## 7    Dictyochophyceae  172
## 1     Bacillariophyta   68
## 3  Chlorarachniophyta   57
## 6     Cryptomonadales   55
## 14   Prymnesiophyceae   37
## 10    Mamiellophyceae   26
## 4       Chlorophyceae   24
## 2       Bolidophyceae   15
## 12  Pinguiochysidales   13
## 17        Ulvophyceae   13
## 9       Eustigmatales    6
## 16   Trebouxiophyceae    5
## 15     Raphydophyceae    3
##                    reads_per_class OTUs_per_class
## Bacillariophyta                765             68
## Bolidophyceae                  573             15
## Chlorarachniophyta             728             57
## Chlorophyceae                   40             24
## Chrysophyceae                16440            257
##                    reads_per_class OTUs_per_class samples_per_class
## Dinophyceae                 137924           6451               117
## Prasinophyceae               20175            300               117
## Chrysophyceae                16440            257               117
## Pelagophyceae                 9270            225               111
## Dictyochophyceae              8084            172               116
## Cryptomonadales               2110             55                95
## Bacillariophyta                765             68                90
## Chlorarachniophyta             728             57                97
## Bolidophyceae                  573             15                95
## Pinguiochysidales              301             13                89
## Prymnesiophyceae               100             37                47
## Mamiellophyceae                 77             26                32
## Raphydophyceae                  41              3                27
## Chlorophyceae                   40             24                25
## Eustigmatales                   40              6                11
## Ulvophyceae                     30             13                19
## Trebouxiophyceae                25              5                 6

OTUs per class vs. samples in which they occur [PLOT DESCRIPTION]

Reads per class vs. OTUs per class [PLOT DESCRIPTION]

## Warning in trans$transform(limits): NaNs produced

## Warning in trans$transform(limits): NaNs produced
## Warning in trans$transform(limits): NaNs produced

1.6.1.1) Relative values

##   reads_per_class    OTUs_per_class samples_per_class 
##           100.000           100.000          1035.043
##                    reads_per_class OTUs_per_class samples_per_class
## Dinophyceae            70.11076488    83.48647599        100.000000
## Prasinophyceae         10.25553697     3.88248997        100.000000
## Chrysophyceae           8.35692827     3.32599974        100.000000
## Pelagophyceae           4.71220955     2.91186748         94.871795
## Dictyochophyceae        4.10933139     2.22596092         99.145299
## Cryptomonadales         1.07257413     0.71178983         81.196581
## Bacillariophyta         0.38887166     0.88003106         76.923077
## Chlorarachniophyta      0.37006349     0.73767309         82.905983
## Bolidophyceae           0.29127250     0.19412450         81.196581
## Pinguiochysidales       0.15300702     0.16824123         76.068376
## Prymnesiophyceae        0.05083290     0.47884043         40.170940
## Mamiellophyceae         0.03914133     0.33648246         27.350427
## Raphydophyceae          0.02084149     0.03882490         23.076923
## Chlorophyceae           0.02033316     0.31059920         21.367521
## Eustigmatales           0.02033316     0.07764980          9.401709
## Ulvophyceae             0.01524987     0.16824123         16.239316
## Trebouxiophyceae        0.01270822     0.06470817          5.128205

OTUs per class vs. samples in which they occur [PLOT DESCRIPTION]

## Warning in trans$transform(value): NaNs produced

Reads per class vs. OTUs per class [PLOT DESCRIPTION]

## Warning in trans$transform(limits): NaNs produced

## Warning in trans$transform(limits): NaNs produced
## Warning in trans$transform(value): NaNs produced

## Warning in trans$transform(limits): NaNs produced
## Warning in trans$transform(limits): NaNs produced

scale_y_continuous(limits = c(0, upper.lim), expand = c(0.01, 0)))
## Warning in trans$transform(limits): NaNs produced

1.6.2) Non-normalized data

## [1] 47144   125
##   Row.names st001 st002 st003 st004
## 1     OTU_1   469   187   240   135
## 2    OTU_10    10    72   173   176
## 3   OTU_100   264   325    25    28
## 4  OTU_1000     4     1     8     5
## 5 OTU_10000     4     6     5     2
## [1] 47144   124
##           st001 st002 st003 st004 st005
## OTU_1       469   187   240   135   184
## OTU_10       10    72   173   176    84
## OTU_100     264   325    25    28    58
## OTU_1000      4     1     8     5     0
## OTU_10000     4     6     5     2     0
## [1] 47144   125
##       st001 st002 st003 st004 st005
## OTU_1   469   187   240   135   184
## OTU_2   445   389   760   819  1003
## OTU_3  1030  2196  1611   453   512
## OTU_4   135    66    36   365   146
## OTU_5     0   178   285   175    49
## [1] 14416   125
##        st001 st002 st003 st004 st005
## OTU_1    469   187   240   135   184
## OTU_4    135    66    36   365   146
## OTU_18   908   383   349   349   508
## OTU_25  2214  1138   716   606  1386
## OTU_33   496   166   141    60   242
##        st001 st002 st003 st004 st005
## OTU_1    469   187   240   135   184
## OTU_4    135    66    36   365   146
## OTU_18   908   383   349   349   508
## OTU_25  2214  1138   716   606  1386
## OTU_33   496   166   141    60   242
## [1] 12117   117
## [1] 117
## [1] 117
## [1] 117
## [1] 117
## [1] 113
## [1] 116
## [1] 103
## [1] 112
## [1] 112
## [1] 108
## [1] 111
## [1] 97
## [1] 65
## [1] 35
## [1] 75
## [1] 50
## [1] 75
## [1] 18
##               Group.1       x
## 8         Dinophyceae 2057121
## 13     Prasinophyceae  246726
## 5       Chrysophyceae  217440
## 11      Pelagophyceae  142764
## 7    Dictyochophyceae  118388
## 6     Cryptomonadales   30359
## 3  Chlorarachniophyta   11722
## 1     Bacillariophyta   10858
## 2       Bolidophyceae    8322
## 12  Pinguiochysidales    5055
## 14   Prymnesiophyceae    1236
## 10    Mamiellophyceae    1170
## 9       Eustigmatales     741
## 15     Raphydophyceae     716
## 4       Chlorophyceae     669
## 17        Ulvophyceae     442
## 16   Trebouxiophyceae     247
##               Group.1     x
## 8         Dinophyceae 12117
## 13     Prasinophyceae   523
## 11      Pelagophyceae   458
## 5       Chrysophyceae   431
## 7    Dictyochophyceae   276
## 1     Bacillariophyta   137
## 6     Cryptomonadales   105
## 3  Chlorarachniophyta    91
## 14   Prymnesiophyceae    79
## 4       Chlorophyceae    55
## 10    Mamiellophyceae    44
## 17        Ulvophyceae    26
## 12  Pinguiochysidales    25
## 2       Bolidophyceae    19
## 16   Trebouxiophyceae    14
## 9       Eustigmatales    11
## 15     Raphydophyceae     5
##                    reads_per_class OTUs_per_class
## Bacillariophyta              10858            137
## Bolidophyceae                 8322             19
## Chlorarachniophyta           11722             91
## Chlorophyceae                  669             55
## Chrysophyceae               217440            431
##                    reads_per_class OTUs_per_class samples_per_class
## Dinophyceae                2057121          12117               117
## Prasinophyceae              246726            523               117
## Chrysophyceae               217440            431               117
## Pelagophyceae               142764            458               113
## Dictyochophyceae            118388            276               116
## Cryptomonadales              30359            105               103
## Chlorarachniophyta           11722             91               112
## Bacillariophyta              10858            137               112
## Bolidophyceae                 8322             19               108
## Pinguiochysidales             5055             25               111
## Prymnesiophyceae              1236             79                97
## Mamiellophyceae               1170             44                65
## Eustigmatales                  741             11                35
## Raphydophyceae                 716              5                75
## Chlorophyceae                  669             55                75
## Ulvophyceae                    442             26                50
## Trebouxiophyceae               247             14                18
##   reads_per_class    OTUs_per_class samples_per_class 
##           100.000           100.000          1317.094
##                    reads_per_class OTUs_per_class samples_per_class
## Dinophyceae           72.079127505    84.05244173         100.00000
## Prasinophyceae         8.644992109     3.62791343         100.00000
## Chrysophyceae          7.618844727     2.98973363         100.00000
## Pelagophyceae          5.002284532     3.17702553          96.58120
## Dictyochophyceae       4.148177840     1.91453940          99.14530
## Cryptomonadales        1.063744054     0.72835738          88.03419
## Chlorarachniophyta     0.410725248     0.63124306          95.72650
## Bacillariophyta        0.380451693     0.95033296          95.72650
## Bolidophyceae          0.291593202     0.13179800          92.30769
## Pinguiochysidales      0.177121321     0.17341842          94.87179
## Prymnesiophyceae       0.043308003     0.54800222          82.90598
## Mamiellophyceae        0.040995439     0.30521643          55.55556
## Eustigmatales          0.025963778     0.07630411          29.91453
## Raphydophyceae         0.025087807     0.03468368          64.10256
## Chlorophyceae          0.023440982     0.38152053          64.10256
## Ulvophyceae            0.015487166     0.18035516          42.73504
## Trebouxiophyceae       0.008654593     0.09711432          15.38462

OTUs per class vs. samples in which they occur [PLOT DESCRIPTION]

Reads per class vs. OTUs per class [PLOT DESCRIPTION]

## Warning in trans$transform(limits): NaNs produced

2) 16S amplicons

2.1) Data overview

Let’s read the dataset and remove the samples containing less than 5836 reads:

## [1] 960 129
##   OTUId st001 st002 st003 st004
## 1 OTU_1 72073 45381 40792 26317
## 2 OTU_2 11925 13467  6500  4802
## 3 OTU_3    19    12    10    21
## 4 OTU_4  1795  2861  1057   993
## 5 OTU_5  1766  1341   463   499
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = 0): invalid factor level,
## NA generated
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = 0): invalid factor level,
## NA generated
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = 0): invalid factor level,
## NA generated
## Warning in `[<-.factor`(`*tmp*`, thisvar, value = 0): invalid factor level,
## NA generated
## [1] 960 128
##       st001 st002 st003 st004 st005
## OTU_1 72073 45381 40792 26317 65068
## OTU_2 11925 13467  6500  4802 10947
## OTU_3    19    12    10    21     4
## OTU_4  1795  2861  1057   993  2466
## OTU_5  1766  1341   463   499  1912
## [1] 960   6
##                                                                                                        SILVA_taxonomic_classification
## OTU_1   KC002097.1.1314_Bacteria;Cyanobacteria;Cyanobacteria;SubsectionI;FamilyI;Prochlorococcus;unidentified_marine_bacterioplankton
## OTU_2                          EU800388.1.1465_Bacteria;Proteobacteria;Alphaproteobacteria;SAR11_clade;Surface_1;uncultured_bacterium
## OTU_3 KJ590614.1.1421_Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;uncultured_bacterium
## OTU_4  EU805005.1.1343_Bacteria;Actinobacteria;Acidimicrobiia;Acidimicrobiales;OM1_clade;Candidatus_Actinomarina;uncultured_bacterium
## OTU_5     KC000519.1.1314_Bacteria;Cyanobacteria;Cyanobacteria;SubsectionI;FamilyI;Synechococcus;unidentified_marine_bacterioplankton
##                                                                                                Phytoref_taxonomic_classification
## OTU_1   EU803052.1.1425|Bacteria|Cyanobacteria|Cyanobacteria|SubsectionI|FamilyI|Prochlorococcus|uncultured|uncultured+bacterium
## OTU_2                                                                                                                       <NA>
## OTU_3                                                                                                                       <NA>
## OTU_4                                                                                                                       <NA>
## OTU_5 AY663935.1.1244|Bacteria|Cyanobacteria|Cyanobacteria|SubsectionI|FamilyI|Synechococcus|uncultured|uncultured+Synechococcus
##       SILVA_funct_classif Phytoref_funct_classif                class_A
## OTU_1              cyanob                 cyanob          Cyanobacteria
## OTU_2            bacteria                   <NA> heterotrophic_bacteria
## OTU_3            bacteria                   <NA> heterotrophic_bacteria
## OTU_4            bacteria                   <NA> heterotrophic_bacteria
## OTU_5              cyanob                 cyanob          Cyanobacteria
##                      class_B
## OTU_1        Prochlorococcus
## OTU_2 heterotrophic_bacteria
## OTU_3 heterotrophic_bacteria
## OTU_4 heterotrophic_bacteria
## OTU_5          Synechococcus
## [1] 960 122
##       st001 st002 st003 st004 st005
## OTU_1 72073 45381 40792 26317 65068
## OTU_2 11925 13467  6500  4802 10947
## OTU_3    19    12    10    21     4
## OTU_4  1795  2861  1057   993  2466
## OTU_5  1766  1341   463   499  1912
## st122 st124 st137 st144 
##  2727  3408   420   624
## [1] 960 118
## [1] 960 117
## [1] 960   0

Table dimensions and content outline:

## [1] 960 117
##       st001 st002 st003 st004 st005
## OTU_1 72073 45381 40792 26317 65068
## OTU_2 11925 13467  6500  4802 10947
## OTU_3    19    12    10    21     4
## OTU_4  1795  2861  1057   993  2466
## OTU_5  1766  1341   463   499  1912

Minimum number of reads per station:

min(colSums(tb16_tax_occur_min5836)) 
## [1] 5836

Maximum number of reads per station:

max(colSums(tb16_tax_occur_min5836)) 
## [1] 156024

Identification of station with higher number of reads:

amplicons_per_sample<-colSums(tb16_tax_occur_min5836)
amplicons_per_sample[which(colSums(tb16_tax_occur_min5836)>150000)]
##  st050 
## 156024

Overall reads per sample:

2.2) Normalization

Let’s normalize the original dataset by randomly subsampling 5836 reads in each station:

library(vegan)
tb16_tax_occur_min5836_t<-t(tb16_tax_occur_min5836)
tb16_tax_occur_ss5836<-rrarefy(tb16_tax_occur_min5836_t, 5836)

The normalized table shows the following dimensions and format:

## [1] 117 960
##       OTU_1 OTU_2 OTU_3 OTU_4 OTU_5
## st001  4063   680     0    91    90
## st002  3473   954     1   214    87
## st003  4334   688     0    86    53
## st004  3881   706     3   155    84
## st005  4139   773     0   140   118

Its content fits with the expected normalization values (5836 reads per station):

rowSums(tb16_tax_occur_ss5836)
## st001 st002 st003 st004 st005 st007 st009 st010 st012 st013 st014 st015 
##  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836 
## st017 st018 st019 st022 st023 st024 st025 st026 st027 st028 st029 st030 
##  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836 
## st032 st033 st034 st035 st037 st038 st039 st040 st041 st043 st044 st045 
##  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836 
## st046 st047 st049 st050 st051 st052 st053 st055 st056 st057 st058 st059 
##  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836 
## st060 st062 st063 st064 st065 st066 st067 st068 st069 st070 st071 st072 
##  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836 
## st073 st074 st075 st076 st077 st078 st079 st081 st082 st083 st085 st086 
##  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836 
## st087 st088 st089 st092 st093 st094 st095 st096 st097 st098 st101 st102 
##  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836 
## st103 st104 st106 st107 st108 st109 st110 st112 st114 st115 st116 st118 
##  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836 
## st119 st120 st121 st125 st126 st127 st128 st129 st130 st131 st132 st133 
##  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836  5836 
## st134 st135 st138 st139 st140 st141 st143 st145 st146 
##  5836  5836  5836  5836  5836  5836  5836  5836  5836

Let’s check out how many OTUs don’t appear in the new table:

length(which(colSums(tb16_tax_occur_ss5836)==0)) 
## [1] 39

There are 54 OTUs that don’t show any occurrence in the normalized data. Let’s remove them from the table and take a look at its final dimensions:

tb16_tax_occur_ss5836_no_cero<-tb16_tax_occur_ss5836[,-(which(colSums(tb16_tax_occur_ss5836)==0))]
dim(tb16_tax_occur_ss5836_no_cero)
## [1] 117 921

Datasets summary: dim(tb16_tax) –> 993 128 dim(tb16_tax_occur) –> 960 122 dim(tb16_tax_occur_ss5836_no_cero) –> 117 909

2.3) General community analysis

2.3.1) Richness and evenness (Shannon index)

Most of the samples take Shannon Index values between 5.5 and 6:

2.3.2) Richness: OTU number

Lowest number of OTUs per sample:

## [1] 136

Maximum number of OTUs per sample:

## [1] 356

In most of the samples, we can identify between 1000 and 1300 OTUs:

plot(sort(OTUs_per_sample_16S_tax_occur_ss5836), pch=19)

boxplot(OTUs_per_sample_16S_tax_occur_ss5836, pch=19)

2.3.3) Index of evenness

2.3.3.1) Pielou’s index

pielou_evenness_16S_tax_occur_ss5836 <- tb16_tax_occur_ss5836_div/log(OTUs_per_sample_16S_tax_occur_ss5836)

The Pielou index (constrained between 0 and 1) takes values closer to 1 as the variation of species proportion in a sample decreases. Most of the samples get values between 0.8 and 0.9, meaning that the numerical composition of different OTUs in a sample is highly similar:

plot(sort(pielou_evenness_16S_tax_occur_ss5836), pch=19)

boxplot(pielou_evenness_16S_tax_occur_ss5836, pch=19)

The OTU_2, with 27292 reads, is the most abundant in the overall dataset:

head(sort(colSums(tb16_tax_occur_ss5836_no_cero), decreasing=T), n=10L)
##   OTU_1   OTU_2   OTU_4   OTU_5   OTU_3   OTU_6 OTU_552  OTU_10  OTU_23 
##  277010   97628   34553   26999   12085   11019    7175    6737    6383 
## OTU_435 
##    6063

Most of the OTUs show very few occurrences; the plot suggests that we will probably be able to identify a significant ammount of rare otus:

plot(log(sort(colSums(tb16_tax_occur_ss5836_no_cero), decreasing=T)), pch=19)

2.3.4) Abundance Models

2.3.4.1) Rank-Abundance or Dominance/Diversity Model (“radfit”)

The OTUs abundance distribution fits relativelly close to log-normal model.

2.3.4.2) Preston’s Lognormal Model

According to Preston’s lognormal model fit into species frequencies groups, we’re missing ~3226 species:

tb16_tax_occur_ss5836_prestonfit<-prestonfit(colSums(tb16_tax_occur_min5836_t))
plot(tb16_tax_occur_ss5836_prestonfit, main="Pooled species")

veiledspec(tb16_tax_occur_ss5836_prestonfit)
## Extrapolated     Observed       Veiled 
##    972.72511    960.00000     12.72511

When computing Prestons’ lognormal model fit without pooling data into groups, we seem to miss ~3055 species:

## Extrapolated     Observed       Veiled 
##   964.490456   960.000000     4.490456

2.3.6) Beta diversity

2.3.6.1) Dissimilarity matrix using Bray-Curtis index:

The Bray-Curtis dissimilarity, constrained between 0 (minimum distance) and 1 (highest dissimilarity) allows us to quantify the differences between samples according to the composition and relative abundance of their OTUs. In our dataset, most of the samples pairs take dissimilarity values between between 0.7 and 0.8, meaning that their composition is substantially different.

2.3.6.2) Hierarchical clustering

The only relatively evident cluster we can distinguish in the dendogram stands out in the very left side of the plot.

(To be done: assign Longhurst provinces information to each station and check if any of the central clusters is meaningful regarding to the samples’ geographical ubication)

2.3.6.3) Non-metric multidimensional scaling

We can identify a prominent group in the central part of the NMDS plot and a few outliers in the middle-left edge of the plot. The stress parameter takes a value below 0.3, suggesting that the plot is acceptable.

## 
## Call:
## monoMDS(dist = tb16_tax_occur_ss5836_no_cero.bray) 
## 
## Non-metric Multidimensional Scaling
## 
## 117 points, dissimilarity 'bray', call 'vegdist(x = tb16_tax_occur_ss5836_no_cero, method = "bray")'
## 
## Dimensions: 2 
## Stress:     0.1330533 
## Stress type 1, weak ties
## Scores scaled to unit root mean square, rotated to principal components
## Stopped after 118 iterations: Stress nearly unchanged (ratio > sratmax)

When implementing a most robut function for computing NMDS plots, the result is quiet the same:

## Run 0 stress 0.1248949 
## Run 1 stress 0.1258697 
## Run 2 stress 0.1245297 
## ... New best solution
## ... procrustes: rmse 0.006778966  max resid 0.05528081 
## Run 3 stress 0.1277239 
## Run 4 stress 0.1255918 
## Run 5 stress 0.1278339 
## Run 6 stress 0.126939 
## Run 7 stress 0.1258749 
## Run 8 stress 0.1297871 
## Run 9 stress 0.1245368 
## ... procrustes: rmse 0.0009401818  max resid 0.007321792 
## *** Solution reached
## Warning in ordiplot(x, choices = choices, type = type, display = display, :
## Species scores not available

2.4) Geographical analysis

Working datasets:

  1. Community matrix: tb16_tax_occur_ss5836_no_cero
dim(tb16_tax_occur_ss5836_no_cero)
## [1] 117 921
tb16_tax_occur_ss5836_no_cero[1:5, 1:5]
##       OTU_1 OTU_2 OTU_3 OTU_4 OTU_5
## st001  4063   680     0    91    90
## st002  3473   954     1   214    87
## st003  4334   688     0    86    53
## st004  3881   706     3   155    84
## st005  4139   773     0   140   118
  1. Community Bray-Curtis: tb16_tax_occur_ss5836_no_cero.bray
## NULL
tb16_tax_occur_ss5836_no_cero.bray<-as.matrix(tb16_tax_occur_ss5836_no_cero.bray)
  1. Stations distances in km: geo_distances_MP_16S
dim(geo_distances_MP_16S)
## [1] 117 117

Communities quickly change their composition across geographical distances:

plot(geo_distances_MP_16S, tb16_tax_occur_ss5836_no_cero.bray, pch=19, cex=0.4, xlab="Geopgraphical distances", ylab="Bray-Curtis dissimilarities")

2.4.1) Mantel correlograms

Mantel statistic is -significantlly- so low, meaning that the correlation between samples dissimilarity and geographical distances is weak.

mantel(geo_distances_MP_16S, tb16_tax_occur_ss5836_no_cero.bray)
## 
## Mantel statistic based on Pearson's product-moment correlation 
## 
## Call:
## mantel(xdis = geo_distances_MP_16S, ydis = tb16_tax_occur_ss5836_no_cero.bray) 
## 
## Mantel statistic r: 0.01288 
##       Significance: 0.199 
## 
## Upper quantiles of permutations (null model):
##    90%    95%  97.5%    99% 
## 0.0230 0.0303 0.0395 0.0543 
## Permutation: free
## Number of permutations: 999

Maximum distance between samples:

## [1] 19500.19

Minimum distance between samples:

## [1] 0

Correlograms:

MP_16s_ss5836_mantel_correl_by_1000km<-mantel.correlog(tb16_tax_occur_ss5836_no_cero.bray, D.geo=geo_distances_MP_16S, break.pts=seq(0,20000, by=1000))
plot(MP_16s_ss5836_mantel_correl_by_1000km)

MP_16s_ss5836_mantel_correl_by_100km<-mantel.correlog(tb16_tax_occur_ss5836_no_cero.bray, D.geo=geo_distances_MP_16S, break.pts=seq(0,20000, by=100))
plot(MP_16s_ss5836_mantel_correl_by_100km)

mantel(tb18_tax_occur_ss8522_no_cero.bray, tb16_tax_occur_ss5836_no_cero.bray)
## 
## Mantel statistic based on Pearson's product-moment correlation 
## 
## Call:
## mantel(xdis = tb18_tax_occur_ss8522_no_cero.bray, ydis = tb16_tax_occur_ss5836_no_cero.bray) 
## 
## Mantel statistic r: 0.6253 
##       Significance: 0.001 
## 
## Upper quantiles of permutations (null model):
##    90%    95%  97.5%    99% 
## 0.0808 0.1084 0.1300 0.1689 
## Permutation: free
## Number of permutations: 999

2.5) Abundance vs. occurence

In the following plot, we can appreciate the OTUs distribution according to their percentage of occurence and relative abundance. The red line keeps up OTUs that occur in more than 80% of the samples, the green line limits regionally rare OTUs (< 0.001%), and the blue one restricts regionally abundant OTUs (> 0.1%).

Regionally abundant OTUs (relative abundance over 0.1%):

##         mean_rabund perc_occur
## OTU_1   0.405690000  100.00000
## OTU_2   0.142979327  100.00000
## OTU_4   0.050603973  100.00000
## OTU_5   0.039540899  100.00000
## OTU_3   0.017698869   92.30769
## OTU_6   0.016137678  100.00000
## OTU_552 0.010508017   99.14530
## OTU_10  0.009866552  100.00000
## OTU_23  0.009348108   99.14530
## OTU_435 0.008879457  100.00000
## OTU_104 0.008807695  100.00000
## OTU_12  0.007561379  100.00000
## OTU_9   0.007381241  100.00000
## OTU_13  0.006978495  100.00000
## OTU_28  0.006380966  100.00000
## OTU_20  0.006092453   94.01709
## OTU_7   0.005767327   91.45299
## OTU_34  0.005027738   94.87179
## OTU_198 0.004707006  100.00000
## OTU_11  0.004608882   99.14530
## OTU_8   0.004598630   98.29060
## OTU_19  0.004204671  100.00000
## OTU_368 0.004172452   94.87179
## OTU_234 0.004042108   99.14530
## OTU_25  0.003926410  100.00000
## OTU_33  0.003450437   96.58120
## OTU_14  0.003391856   99.14530
## OTU_37  0.003296661   98.29060
## OTU_40  0.003242474   76.06838
## OTU_21  0.003226364   99.14530
## OTU_29  0.003060872   99.14530
## OTU_15  0.002930528   93.16239
## OTU_24  0.002756249   83.76068
## OTU_45  0.002631764  100.00000
## OTU_16  0.002501421   75.21368
## OTU_709 0.002489704   90.59829
## OTU_27  0.002316890   82.90598
## OTU_102 0.002255379   98.29060
## OTU_36  0.002250986   94.01709
## OTU_39  0.002195333   99.14530
## OTU_70  0.002168972  100.00000
## OTU_724 0.002139681  100.00000
## OTU_240 0.002034235   82.90598
## OTU_139 0.002019590   94.87179
## OTU_166 0.001999086   96.58120
## OTU_35  0.001990299   85.47009
## OTU_44  0.001985905   93.16239
## OTU_18  0.001980047   95.72650
## OTU_17  0.001947828   65.81197
## OTU_163 0.001884853   97.43590
## OTU_654 0.001873136   94.01709
## OTU_22  0.001829200   75.21368
## OTU_694 0.001796981   91.45299
## OTU_55  0.001725219   89.74359
## OTU_90  0.001675425   99.14530
## OTU_177 0.001675425   86.32479
## OTU_26  0.001646134   71.79487
## OTU_32  0.001635882   81.19658
## OTU_63  0.001572907   78.63248
## OTU_43  0.001562656   66.66667
## OTU_41  0.001530436   92.30769
## OTU_49  0.001410344   63.24786
## OTU_451 0.001401557   93.16239
## OTU_247 0.001400093   97.43590
## OTU_56  0.001392770   88.03419
## OTU_62  0.001378125   90.59829
## OTU_42  0.001353228   98.29060
## OTU_31  0.001241923   40.17094
## OTU_93  0.001225813   93.16239
## OTU_84  0.001221420   91.45299
## OTU_98  0.001211168   88.03419
## OTU_61  0.001177484   14.52991
## OTU_46  0.001155516   64.95726
## OTU_52  0.001124761   51.28205
## OTU_404 0.001121831   98.29060
## OTU_89  0.001121831   94.87179
## OTU_48  0.001076431   79.48718
## OTU_147 0.001074966   95.72650
## OTU_103 0.001058857   77.77778
## OTU_74  0.001017850   64.10256
## OTU_67  0.001017850   47.00855
## OTU_66  0.001016385   66.66667
## OTU_59  0.001006133   35.04274
dim(tb16_ss5836_abundant_sorted)
## [1] 83  2

Proportion of regionally abundant OTUs (%):

## [1] 0.6430868

Cosmopolitan OTUs (relative abundance over 0.1% and occurence in more than 80% of samples):

##         mean_rabund perc_occur
## OTU_1   0.405690000  100.00000
## OTU_2   0.142979327  100.00000
## OTU_4   0.050603973  100.00000
## OTU_5   0.039540899  100.00000
## OTU_6   0.016137678  100.00000
## OTU_10  0.009866552  100.00000
## OTU_435 0.008879457  100.00000
## OTU_104 0.008807695  100.00000
## OTU_12  0.007561379  100.00000
## OTU_9   0.007381241  100.00000
## OTU_13  0.006978495  100.00000
## OTU_28  0.006380966  100.00000
## OTU_198 0.004707006  100.00000
## OTU_19  0.004204671  100.00000
## OTU_25  0.003926410  100.00000
## OTU_45  0.002631764  100.00000
## OTU_70  0.002168972  100.00000
## OTU_724 0.002139681  100.00000
## OTU_552 0.010508017   99.14530
## OTU_23  0.009348108   99.14530
## OTU_11  0.004608882   99.14530
## OTU_234 0.004042108   99.14530
## OTU_14  0.003391856   99.14530
## OTU_21  0.003226364   99.14530
## OTU_29  0.003060872   99.14530
## OTU_39  0.002195333   99.14530
## OTU_90  0.001675425   99.14530
## OTU_8   0.004598630   98.29060
## OTU_37  0.003296661   98.29060
## OTU_102 0.002255379   98.29060
## OTU_42  0.001353228   98.29060
## OTU_404 0.001121831   98.29060
## OTU_163 0.001884853   97.43590
## OTU_247 0.001400093   97.43590
## OTU_33  0.003450437   96.58120
## OTU_166 0.001999086   96.58120
## OTU_18  0.001980047   95.72650
## OTU_147 0.001074966   95.72650
## OTU_34  0.005027738   94.87179
## OTU_368 0.004172452   94.87179
## OTU_139 0.002019590   94.87179
## OTU_89  0.001121831   94.87179
## OTU_20  0.006092453   94.01709
## OTU_36  0.002250986   94.01709
## OTU_654 0.001873136   94.01709
## OTU_15  0.002930528   93.16239
## OTU_44  0.001985905   93.16239
## OTU_451 0.001401557   93.16239
## OTU_93  0.001225813   93.16239
## OTU_3   0.017698869   92.30769
## OTU_41  0.001530436   92.30769
## OTU_7   0.005767327   91.45299
## OTU_694 0.001796981   91.45299
## OTU_84  0.001221420   91.45299
## OTU_709 0.002489704   90.59829
## OTU_62  0.001378125   90.59829
## OTU_55  0.001725219   89.74359
## OTU_56  0.001392770   88.03419
## OTU_98  0.001211168   88.03419
## OTU_177 0.001675425   86.32479
## OTU_35  0.001990299   85.47009
## OTU_24  0.002756249   83.76068
## OTU_27  0.002316890   82.90598
## OTU_240 0.002034235   82.90598
## OTU_32  0.001635882   81.19658
dim(otu_tb16_ss5836_cosmop_sorted)
## [1] 65  2

Proportion of cosmopolitan OTUs (%):

## [1] 0.254059

Number and proportion (%) of rare OTUs:

dim(otu_tb16_ss5836_rabund_percoccur[otu_tb16_ss5836_rabund_percoccur$mean_rabund < 0.00001 & otu_tb16_ss5836_rabund_percoccur$mean_rabund >0,])
## [1] 217   2
## [1] 65.70204

2.6) Taxonomic composition analysis

2.6.1) Normalized data

2.6.1.1) Absolute values

Let’s add the taxonomic classification to the left OTUs by merging “tb16_tax_occur_ss5836_no_cero” with “tb16_tax”.

## [1] 921 124
##   Row.names st001 st002 st003 st004
## 1     OTU_1  4063  3473  4334  3881
## 2    OTU_10    16    36    14    21
## 3   OTU_100     0     0     0     0
## 4  OTU_1001     0     0     0     0
## 5  OTU_1004     0     0     0     0
## [1] 921 123
##          st001 st002 st003 st004 st005
## OTU_1     4063  3473  4334  3881  4139
## OTU_10      16    36    14    21    11
## OTU_100      0     0     0     0     0
## OTU_1001     0     0     0     0     0
## OTU_1004     0     0     0     0     0
## [1] 921 124
##                      class_A                class_B OTU_no
## OTU_1          Cyanobacteria        Prochlorococcus      1
## OTU_2 heterotrophic_bacteria heterotrophic_bacteria      2
## OTU_3 heterotrophic_bacteria heterotrophic_bacteria      3
## OTU_4 heterotrophic_bacteria heterotrophic_bacteria      4
## OTU_5          Cyanobacteria          Synechococcus      5
## [1]  87 124
## [1] 834 124
##         st146
## OTU_54      3
## OTU_57     17
## OTU_58      1
## OTU_80      0
## OTU_114     0
##                                                                SILVA_taxonomic_classification
## OTU_54                EU237456.1.1310_Bacteria;Cyanobacteria;Chloroplast;uncultured_bacterium
## OTU_57   AY664138.1.1204_Bacteria;Cyanobacteria;Chloroplast;uncultured_phototrophic_eukaryote
## OTU_58                                                                                   <NA>
## OTU_80  HM565911.2962.4423_Bacteria;Cyanobacteria;Chloroplast;uncultured_prymnesiophyte_C8704
## OTU_114               JN986350.1.1460_Bacteria;Cyanobacteria;Chloroplast;uncultured_bacterium
##                                                                                                                                                              Phytoref_taxonomic_classification
## OTU_54  5523_AB847984|Eukaryota|Hacrobia|Haptophyta|Prymnesiophyceae|Prymnesiophyceae_X|Prymnesiophyceae_XX|Prymnesiophyceae_XXX|Braarudosphaeraceae|Braarudosphaera|Braarudosphaera_bigelowii
## OTU_57            474_Eukaryota|Stramenopiles|Ochrophyta|Dictyochophyceae|Dictyochophyceae_X|Dictyochophyceae_XX|Dictyochophyceae_XXX|Dictyochophyceae_XXXX|Helicopedinella|Helicopedinella_sp
## OTU_58       432_DQ438491|Eukaryota|Archaeplastida|Chlorophyta|Prasinophyceae|Prasinophyceae_X|Prasino-clade-9|Prasino-clade-9_X|Prasino-clade-9_XX|Prasino-clade-9_XXX|Prasino-clade-9_XXX_sp
## OTU_80                                                                                                                                                                                    <NA>
## OTU_114   891_FJ649309|Eukaryota|Archaeplastida|Chlorophyta|Prasinophyceae|Prasinophyceae_X|Prasino-clade-7|Prasino-clade-7_X|Prasino-clade-7_X-B|Prasino-clade-7_X-B1|Prasino-clade-7_X-B1_sp
##         SILVA_funct_classif Phytoref_funct_classif          class_A
## OTU_54              plastid                plastid Prymnesiophyceae
## OTU_57              plastid                plastid Dictyochophyceae
## OTU_58                 <NA>                plastid   Prasinophyceae
## OTU_80              plastid                   <NA>   other_plastids
## OTU_114             plastid                plastid   Prasinophyceae
##                  class_B OTU_no
## OTU_54  Prymnesiophyceae     54
## OTU_57  Dictyochophyceae     57
## OTU_58    Prasinophyceae     58
## OTU_80    other_plastids     80
## OTU_114   Prasinophyceae    114
##       st146
## OTU_1  1990
## OTU_2   960
## OTU_3    79
## OTU_4   155
## OTU_5   163
##                                                                                                        SILVA_taxonomic_classification
## OTU_1   KC002097.1.1314_Bacteria;Cyanobacteria;Cyanobacteria;SubsectionI;FamilyI;Prochlorococcus;unidentified_marine_bacterioplankton
## OTU_2                          EU800388.1.1465_Bacteria;Proteobacteria;Alphaproteobacteria;SAR11_clade;Surface_1;uncultured_bacterium
## OTU_3 KJ590614.1.1421_Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;uncultured_bacterium
## OTU_4  EU805005.1.1343_Bacteria;Actinobacteria;Acidimicrobiia;Acidimicrobiales;OM1_clade;Candidatus_Actinomarina;uncultured_bacterium
## OTU_5     KC000519.1.1314_Bacteria;Cyanobacteria;Cyanobacteria;SubsectionI;FamilyI;Synechococcus;unidentified_marine_bacterioplankton
##                                                                                                Phytoref_taxonomic_classification
## OTU_1   EU803052.1.1425|Bacteria|Cyanobacteria|Cyanobacteria|SubsectionI|FamilyI|Prochlorococcus|uncultured|uncultured+bacterium
## OTU_2                                                                                                                       <NA>
## OTU_3                                                                                                                       <NA>
## OTU_4                                                                                                                       <NA>
## OTU_5 AY663935.1.1244|Bacteria|Cyanobacteria|Cyanobacteria|SubsectionI|FamilyI|Synechococcus|uncultured|uncultured+Synechococcus
##       SILVA_funct_classif Phytoref_funct_classif                class_A
## OTU_1              cyanob                 cyanob          Cyanobacteria
## OTU_2            bacteria                   <NA> heterotrophic_bacteria
## OTU_3            bacteria                   <NA> heterotrophic_bacteria
## OTU_4            bacteria                   <NA> heterotrophic_bacteria
## OTU_5              cyanob                 cyanob          Cyanobacteria
##                      class_B OTU_no
## OTU_1        Prochlorococcus      1
## OTU_2 heterotrophic_bacteria      2
## OTU_3 heterotrophic_bacteria      3
## OTU_4 heterotrophic_bacteria      4
## OTU_5          Synechococcus      5
class_summary_reads_per_class_16S_protists<-aggregate(rowSums(tb16_protists[1:117]), list(tb16_protists$class_A), sum)
# count the different groups

class_summary_otus_per_class_16S_protists<-aggregate(rowSums(tb16_protists[1:117]), list(tb16_protists$class_A), length)

## READS PER CLASS ##
attach(class_summary_reads_per_class_16S_protists)
class_summary_reads_per_class_16S_protists_order<-class_summary_reads_per_class_16S_protists[order(-x),]
detach(class_summary_reads_per_class_16S_protists)
class_summary_reads_per_class_16S_protists_order
##               Group.1    x
## 10   Prymnesiophyceae 4335
## 7      other_plastids 2149
## 5    Dictyochophyceae 1056
## 9      Prasinophyceae  906
## 6     Mamiellophyceae  358
## 8       Pelagophyceae   69
## 1     Bacillariophyta   49
## 2       Bolidophyceae   35
## 4     Cryptomonadales   26
## 11         Rappemonad   26
## 3  Chlorarachniophyta    3
#fix column names
row.names(class_summary_reads_per_class_16S_protists_order)<-class_summary_reads_per_class_16S_protists_order[,1]
class_summary_reads_per_class_16S_protists_order<-class_summary_reads_per_class_16S_protists_order[c(-1)]
colnames(class_summary_reads_per_class_16S_protists_order)<-c("reads_per_class")


## OTUs PER CLASS ##
attach(class_summary_otus_per_class_16S_protists)
class_summary_otus_per_class_16S_protists_order<-class_summary_otus_per_class_16S_protists[order(-x),]
detach(class_summary_otus_per_class_16S_protists)
class_summary_otus_per_class_16S_protists_order
##               Group.1  x
## 7      other_plastids 31
## 10   Prymnesiophyceae 27
## 9      Prasinophyceae 10
## 5    Dictyochophyceae  7
## 6     Mamiellophyceae  4
## 1     Bacillariophyta  3
## 2       Bolidophyceae  1
## 3  Chlorarachniophyta  1
## 4     Cryptomonadales  1
## 8       Pelagophyceae  1
## 11         Rappemonad  1
row.names(class_summary_otus_per_class_16S_protists_order)<-class_summary_otus_per_class_16S_protists_order[,1]
class_summary_otus_per_class_16S_protists_order<-class_summary_otus_per_class_16S_protists_order[c(-1)]
colnames(class_summary_otus_per_class_16S_protists_order)<-c("OTUs_per_class")
class_summary_reads_per_class_16S_bacteria<-aggregate(rowSums(tb16_bacteria[1:117]), list(tb16_bacteria$class_A), sum)
# count the different groups

class_summary_otus_per_class_16S_bacteria<-aggregate(rowSums(tb16_bacteria[1:117]), list(tb16_bacteria$class_A), length)

## READS PER CLASS ##
attach(class_summary_reads_per_class_16S_bacteria)
class_summary_reads_per_class_16S_bacteria_order<-class_summary_reads_per_class_16S_bacteria[order(-x),]
detach(class_summary_reads_per_class_16S_bacteria)
class_summary_reads_per_class_16S_bacteria_order
##                  Group.1      x
## 2 heterotrophic_bacteria 349867
## 1          Cyanobacteria 323933
#fix column names
row.names(class_summary_reads_per_class_16S_bacteria_order)<-class_summary_reads_per_class_16S_bacteria_order[,1]
class_summary_reads_per_class_16S_bacteria_order<-class_summary_reads_per_class_16S_bacteria_order[c(-1)]
colnames(class_summary_reads_per_class_16S_bacteria_order)<-c("reads_per_class")


## OTUs PER CLASS ##
attach(class_summary_otus_per_class_16S_bacteria)
class_summary_otus_per_class_16S_bacteria_order<-class_summary_otus_per_class_16S_bacteria[order(-x),]
detach(class_summary_otus_per_class_16S_bacteria)
class_summary_otus_per_class_16S_bacteria_order
##                  Group.1   x
## 2 heterotrophic_bacteria 805
## 1          Cyanobacteria  29
row.names(class_summary_otus_per_class_16S_bacteria_order)<-class_summary_otus_per_class_16S_bacteria_order[,1]
class_summary_otus_per_class_16S_bacteria_order<-class_summary_otus_per_class_16S_bacteria_order[c(-1)]
colnames(class_summary_otus_per_class_16S_bacteria_order)<-c("OTUs_per_class")
##         st001 st002 st003 st004 st005
## OTU_80      7     5     0     1     5
## OTU_146     0     3     1     1     3
## OTU_170     0     2     2     3     1
## OTU_182     0     1     0     2     0
## OTU_213     0     0     0     0     0
##         st001 st002 st003 st004 st005
## OTU_80      7     5     0     1     5
## OTU_146     0     3     1     1     3
## OTU_170     0     2     2     3     1
## OTU_182     0     1     0     2     0
## OTU_213     0     0     0     0     0
## [1]  31 117
## [1] 115
## [1] 117
## [1] 96
## [1] 109
## [1] 26
## [1] 26
## [1] 25
## [1] 3
## [1] 14
## [1] 27
## [1] 22
## [1] 117
## [1] 117
##                    reads_per_class OTUs_per_class
## Bacillariophyta                 49              3
## Bolidophyceae                   35              1
## Chlorarachniophyta               3              1
## Cryptomonadales                 26              1
## Dictyochophyceae              1056              7
##                        reads_per_class OTUs_per_class
## Cyanobacteria                   323933             29
## heterotrophic_bacteria          349867            805
## NA                                  NA             NA
## NA.1                                NA             NA
## NA.2                                NA             NA

OTUs per class vs. samples in which they occur - PROTISTS [PLOT DESCRIPTION]

Reads per class vs. OTUs per class - PROTISTS [PLOT DESCRIPTION]

OTUs per class vs. samples in which they occur - PROTISTS vs CYANOBACTERIA [PLOT DESCRIPTION]

Reads per class vs. OTUs per class - PROTISTS vs CYANOBACTERIA [PLOT DESCRIPTION]

## Warning in trans$transform(limits): NaNs produced
## Warning in trans$transform(limits): NaNs produced

2.6.1.2) Relative values

##   reads_per_class    OTUs_per_class samples_per_class 
##          100.0000          100.0000          394.8718
##                    reads_per_class OTUs_per_class samples_per_class
## Prymnesiophyceae       63.16479674      48.214286        100.000000
## Dictyochophyceae       15.38685706      12.500000         97.435897
## Prasinophyceae         13.20122395      17.857143         83.760684
## Mamiellophyceae         5.21637768       7.142857         26.495726
## Pelagophyceae           1.00539123       1.785714         22.222222
## Bacillariophyta         0.71397348       5.357143         17.948718
## Bolidophyceae           0.50998106       1.785714         23.076923
## Cryptomonadales         0.37884307       1.785714         11.111111
## Rappemonad              0.37884307       1.785714         11.111111
## Chlorarachniophyta      0.04371266       1.785714          1.709402

OTUs per class vs. samples in which they occur - PROTISTS [PLOT DESCRIPTION]

Reads per class vs. OTUs per class - PROTISTS [PLOT DESCRIPTION]

OTUs per class vs. samples in which they occur - PROTISTS vs. CYANOBACTERIA [PLOT DESCRIPTION]

##   reads_per_class    OTUs_per_class samples_per_class 
##          100.0000          100.0000          494.8718
##                    reads_per_class OTUs_per_class samples_per_class
## Prymnesiophyceae      1.310475e+00      31.764706        100.000000
## Dictyochophyceae      3.192300e-01       8.235294         97.435897
## Prasinophyceae        2.738848e-01      11.764706         83.760684
## Mamiellophyceae       1.082238e-01       4.705882         26.495726
## Pelagophyceae         2.085878e-02       1.176471         22.222222
## Bacillariophyta       1.481275e-02       3.529412         17.948718
## Bolidophyceae         1.058054e-02       1.176471         23.076923
## Cryptomonadales       7.859829e-03       1.176471         11.111111
## Rappemonad            7.859829e-03       1.176471         11.111111
## Chlorarachniophyta    9.069033e-04       1.176471          1.709402
## Cyanobacteria         9.792531e+01      34.117647        100.000000

## Warning in trans$transform(value): NaNs produced

Reads per class vs. OTUs per class - PROTISTS vs CYANOBACTERIA [PLOT DESCRIPTION]

## Warning in trans$transform(value): NaNs produced

## Warning in trans$transform(limits): NaNs produced
## Warning in trans$transform(limits): NaNs produced

## Warning in trans$transform(limits): NaNs produced

2.6.2) Non-normalized data

## [1] 960 124
##   Row.names st001 st002 st003 st004
## 1     OTU_1 72073 45381 40792 26317
## 2    OTU_10   317   449   135   164
## 3   OTU_100     0     0     0     0
## 4  OTU_1001     0     0     0     0
## 5  OTU_1002     0     0     0     0
## [1] 960 123
##          st001 st002 st003 st004 st005
## OTU_1    72073 45381 40792 26317 65068
## OTU_10     317   449   135   164   170
## OTU_100      0     0     0     0     0
## OTU_1001     0     0     0     0     0
## OTU_1002     0     0     0     0     0
## [1] 960 124
##                      class_A                class_B OTU_no
## OTU_1          Cyanobacteria        Prochlorococcus      1
## OTU_2 heterotrophic_bacteria heterotrophic_bacteria      2
## OTU_3 heterotrophic_bacteria heterotrophic_bacteria      3
## OTU_4 heterotrophic_bacteria heterotrophic_bacteria      4
## OTU_5          Cyanobacteria          Synechococcus      5
## [1]  87 124
## [1] 873 124
##         st146
## OTU_54      4
## OTU_57     26
## OTU_58      1
## OTU_80      0
## OTU_114     0
##                                                                SILVA_taxonomic_classification
## OTU_54                EU237456.1.1310_Bacteria;Cyanobacteria;Chloroplast;uncultured_bacterium
## OTU_57   AY664138.1.1204_Bacteria;Cyanobacteria;Chloroplast;uncultured_phototrophic_eukaryote
## OTU_58                                                                                   <NA>
## OTU_80  HM565911.2962.4423_Bacteria;Cyanobacteria;Chloroplast;uncultured_prymnesiophyte_C8704
## OTU_114               JN986350.1.1460_Bacteria;Cyanobacteria;Chloroplast;uncultured_bacterium
##                                                                                                                                                              Phytoref_taxonomic_classification
## OTU_54  5523_AB847984|Eukaryota|Hacrobia|Haptophyta|Prymnesiophyceae|Prymnesiophyceae_X|Prymnesiophyceae_XX|Prymnesiophyceae_XXX|Braarudosphaeraceae|Braarudosphaera|Braarudosphaera_bigelowii
## OTU_57            474_Eukaryota|Stramenopiles|Ochrophyta|Dictyochophyceae|Dictyochophyceae_X|Dictyochophyceae_XX|Dictyochophyceae_XXX|Dictyochophyceae_XXXX|Helicopedinella|Helicopedinella_sp
## OTU_58       432_DQ438491|Eukaryota|Archaeplastida|Chlorophyta|Prasinophyceae|Prasinophyceae_X|Prasino-clade-9|Prasino-clade-9_X|Prasino-clade-9_XX|Prasino-clade-9_XXX|Prasino-clade-9_XXX_sp
## OTU_80                                                                                                                                                                                    <NA>
## OTU_114   891_FJ649309|Eukaryota|Archaeplastida|Chlorophyta|Prasinophyceae|Prasinophyceae_X|Prasino-clade-7|Prasino-clade-7_X|Prasino-clade-7_X-B|Prasino-clade-7_X-B1|Prasino-clade-7_X-B1_sp
##         SILVA_funct_classif Phytoref_funct_classif          class_A
## OTU_54              plastid                plastid Prymnesiophyceae
## OTU_57              plastid                plastid Dictyochophyceae
## OTU_58                 <NA>                plastid   Prasinophyceae
## OTU_80              plastid                   <NA>   other_plastids
## OTU_114             plastid                plastid   Prasinophyceae
##                  class_B OTU_no
## OTU_54  Prymnesiophyceae     54
## OTU_57  Dictyochophyceae     57
## OTU_58    Prasinophyceae     58
## OTU_80    other_plastids     80
## OTU_114   Prasinophyceae    114
##       st146
## OTU_1  3284
## OTU_2  1540
## OTU_3   108
## OTU_4   236
## OTU_5   257
##                                                                                                        SILVA_taxonomic_classification
## OTU_1   KC002097.1.1314_Bacteria;Cyanobacteria;Cyanobacteria;SubsectionI;FamilyI;Prochlorococcus;unidentified_marine_bacterioplankton
## OTU_2                          EU800388.1.1465_Bacteria;Proteobacteria;Alphaproteobacteria;SAR11_clade;Surface_1;uncultured_bacterium
## OTU_3 KJ590614.1.1421_Bacteria;Proteobacteria;Alphaproteobacteria;Rhodobacterales;Rhodobacteraceae;Sulfitobacter;uncultured_bacterium
## OTU_4  EU805005.1.1343_Bacteria;Actinobacteria;Acidimicrobiia;Acidimicrobiales;OM1_clade;Candidatus_Actinomarina;uncultured_bacterium
## OTU_5     KC000519.1.1314_Bacteria;Cyanobacteria;Cyanobacteria;SubsectionI;FamilyI;Synechococcus;unidentified_marine_bacterioplankton
##                                                                                                Phytoref_taxonomic_classification
## OTU_1   EU803052.1.1425|Bacteria|Cyanobacteria|Cyanobacteria|SubsectionI|FamilyI|Prochlorococcus|uncultured|uncultured+bacterium
## OTU_2                                                                                                                       <NA>
## OTU_3                                                                                                                       <NA>
## OTU_4                                                                                                                       <NA>
## OTU_5 AY663935.1.1244|Bacteria|Cyanobacteria|Cyanobacteria|SubsectionI|FamilyI|Synechococcus|uncultured|uncultured+Synechococcus
##       SILVA_funct_classif Phytoref_funct_classif                class_A
## OTU_1              cyanob                 cyanob          Cyanobacteria
## OTU_2            bacteria                   <NA> heterotrophic_bacteria
## OTU_3            bacteria                   <NA> heterotrophic_bacteria
## OTU_4            bacteria                   <NA> heterotrophic_bacteria
## OTU_5              cyanob                 cyanob          Cyanobacteria
##                      class_B OTU_no
## OTU_1        Prochlorococcus      1
## OTU_2 heterotrophic_bacteria      2
## OTU_3 heterotrophic_bacteria      3
## OTU_4 heterotrophic_bacteria      4
## OTU_5          Synechococcus      5
class_summary_reads_per_class_16S_protists_non.norm<-aggregate(rowSums(tb16_protists_non.norm[1:117]), list(tb16_protists_non.norm$class_A), sum)
# count the different groups

class_summary_otus_per_class_16S_protists_non.norm<-aggregate(rowSums(tb16_protists_non.norm[1:117]), list(tb16_protists_non.norm$class_A), length)

## READS PER CLASS ##
attach(class_summary_reads_per_class_16S_protists_non.norm)
class_summary_reads_per_class_16S_protists_non.norm_order<-class_summary_reads_per_class_16S_protists_non.norm[order(-x),]
detach(class_summary_reads_per_class_16S_protists_non.norm)
class_summary_reads_per_class_16S_protists_non.norm_order
##               Group.1     x
## 10   Prymnesiophyceae 32950
## 7      other_plastids 16065
## 5    Dictyochophyceae  8988
## 9      Prasinophyceae  6579
## 6     Mamiellophyceae  1432
## 1     Bacillariophyta   509
## 8       Pelagophyceae   493
## 2       Bolidophyceae   267
## 4     Cryptomonadales   147
## 11         Rappemonad   133
## 3  Chlorarachniophyta    34
#fix column names
row.names(class_summary_reads_per_class_16S_protists_non.norm_order)<-class_summary_reads_per_class_16S_protists_non.norm_order[,1]
class_summary_reads_per_class_16S_protists_non.norm_order<-class_summary_reads_per_class_16S_protists_non.norm_order[c(-1)]
colnames(class_summary_reads_per_class_16S_protists_non.norm_order)<-c("reads_per_class")


## OTUs PER CLASS ##
attach(class_summary_otus_per_class_16S_protists_non.norm)
class_summary_otus_per_class_16S_protists_non.norm_order<-class_summary_otus_per_class_16S_protists_non.norm[order(-x),]
detach(class_summary_otus_per_class_16S_protists_non.norm)
class_summary_otus_per_class_16S_protists_non.norm_order
##               Group.1  x
## 7      other_plastids 31
## 10   Prymnesiophyceae 27
## 9      Prasinophyceae 10
## 5    Dictyochophyceae  7
## 6     Mamiellophyceae  4
## 1     Bacillariophyta  3
## 2       Bolidophyceae  1
## 3  Chlorarachniophyta  1
## 4     Cryptomonadales  1
## 8       Pelagophyceae  1
## 11         Rappemonad  1
row.names(class_summary_otus_per_class_16S_protists_non.norm_order)<-class_summary_otus_per_class_16S_protists_non.norm_order[,1]
class_summary_otus_per_class_16S_protists_non.norm_order<-class_summary_otus_per_class_16S_protists_non.norm_order[c(-1)]
colnames(class_summary_otus_per_class_16S_protists_non.norm_order)<-c("OTUs_per_class")
class_summary_reads_per_class_16S_bacteria_non.norm<-aggregate(rowSums(tb16_bacteria_non.norm[1:117]), list(tb16_bacteria_non.norm$class_A), sum)
# count the different groups

class_summary_otus_per_class_16S_bacteria_non.norm<-aggregate(rowSums(tb16_bacteria_non.norm[1:117]), list(tb16_bacteria_non.norm$class_A), length)

## READS PER CLASS ##
attach(class_summary_reads_per_class_16S_bacteria_non.norm)
class_summary_reads_per_class_16S_bacteria_non.norm_order<-class_summary_reads_per_class_16S_bacteria_non.norm[order(-x),]
detach(class_summary_reads_per_class_16S_bacteria_non.norm)
class_summary_reads_per_class_16S_bacteria_non.norm_order
##                  Group.1       x
## 2 heterotrophic_bacteria 2883944
## 1          Cyanobacteria 2819377
#fix column names
row.names(class_summary_reads_per_class_16S_bacteria_non.norm_order)<-class_summary_reads_per_class_16S_bacteria_non.norm_order[,1]
class_summary_reads_per_class_16S_bacteria_non.norm_order<-class_summary_reads_per_class_16S_bacteria_non.norm_order[c(-1)]
colnames(class_summary_reads_per_class_16S_bacteria_non.norm_order)<-c("reads_per_class")


## OTUs PER CLASS ##
attach(class_summary_otus_per_class_16S_bacteria_non.norm)
class_summary_otus_per_class_16S_bacteria_non.norm_order<-class_summary_otus_per_class_16S_bacteria_non.norm[order(-x),]
detach(class_summary_otus_per_class_16S_bacteria_non.norm)
class_summary_otus_per_class_16S_bacteria_non.norm_order
##                  Group.1   x
## 2 heterotrophic_bacteria 842
## 1          Cyanobacteria  31
row.names(class_summary_otus_per_class_16S_bacteria_non.norm_order)<-class_summary_otus_per_class_16S_bacteria_non.norm_order[,1]
class_summary_otus_per_class_16S_bacteria_non.norm_order<-class_summary_otus_per_class_16S_bacteria_non.norm_order[c(-1)]
colnames(class_summary_otus_per_class_16S_bacteria_non.norm_order)<-c("OTUs_per_class")
##         st001 st002 st003 st004 st005
## OTU_80    102   130     2     5    56
## OTU_146    21    33    31    39    21
## OTU_170     4    21     7    15     7
## OTU_182     4     8     2     3     1
## OTU_213     0     2     1     2     0
##         st001 st002 st003 st004 st005
## OTU_80    102   130     2     5    56
## OTU_146    21    33    31    39    21
## OTU_170     4    21     7    15     7
## OTU_182     4     8     2     3     1
## OTU_213     0     2     1     2     0
## [1]  31 117
## [1] 117
## [1] 117
## [1] 116
## [1] 116
## [1] 43
## [1] 63
## [1] 69
## [1] 23
## [1] 32
## [1] 62
## [1] 46
## [1] 117
## [1] 117
##                    reads_per_class OTUs_per_class
## Bacillariophyta                509              3
## Bolidophyceae                  267              1
## Chlorarachniophyta              34              1
## Cryptomonadales                147              1
## Dictyochophyceae              8988              7
##                        reads_per_class OTUs_per_class
## Cyanobacteria                  2819377             31
## heterotrophic_bacteria         2883944            842
## NA                                  NA             NA
## NA.1                                NA             NA
## NA.2                                NA             NA
##   reads_per_class    OTUs_per_class samples_per_class 
##          100.0000          100.0000          394.8718
##                    reads_per_class OTUs_per_class samples_per_class
## Prymnesiophyceae       63.94085229      48.214286        100.000000
## Dictyochophyceae       17.44158969      12.500000         97.435897
## Prasinophyceae         12.76682450      17.857143         83.760684
## Mamiellophyceae         2.77885586       7.142857         26.495726
## Bacillariophyta         0.98773578       5.357143         22.222222
## Pelagophyceae           0.95668711       1.785714         17.948718
## Bolidophyceae           0.51812466       1.785714         23.076923
## Cryptomonadales         0.28525964       1.785714         11.111111
## Rappemonad              0.25809206       1.785714         11.111111
## Chlorarachniophyta      0.06597842       1.785714          1.709402

OTUs per class vs. samples in which they occur - PROTISTS [PLOT DESCRIPTION]

Reads per class vs. OTUs per class - PROTISTS [PLOT DESCRIPTION]

OTUs per class vs. samples in which they occur - PROTISTS vs. CYANOBACTERIA [PLOT DESCRIPTION]

##   reads_per_class    OTUs_per_class samples_per_class 
##          100.0000          100.0000          494.8718
##                    reads_per_class OTUs_per_class samples_per_class
## Prymnesiophyceae       1.147720112      31.034483        100.000000
## Dictyochophyceae       0.313071574       8.045977         97.435897
## Prasinophyceae         0.229160869      11.494253         83.760684
## Mamiellophyceae        0.049879672       4.597701         26.495726
## Bacillariophyta        0.017729576       3.448276         22.222222
## Pelagophyceae          0.017172261       1.149425         17.948718
## Bolidophyceae          0.009300190       1.149425         23.076923
## Cryptomonadales        0.005120329       1.149425         11.111111
## Rappemonad             0.004632679       1.149425         11.111111
## Chlorarachniophyta     0.001184294       1.149425          1.709402
## Cyanobacteria         98.205028442      35.632184        100.000000

Reads per class vs. OTUs per class - PROTISTS vs CYANOBACTERIA [PLOT DESCRIPTION]